NSF PAR Search | NSF Public Access Repository

Simple Lifelong Learning Machines

Dey, Jayanta; Vogelstein, Joshua T; Helm, Hayden S; LeVine, Will; Mehta, Ronak D; Tomita, Tyler M; Xu, Haoyin; Geisa, Ali; Wang, Qingyang; van de Ven, Gido M; et al (July 2025, IEEE Transactions on Pattern Analysis and Machine Intelligence)

Free, publicly-accessible full text available July 30, 2026
Distributionally Robust Optimization with Bias and Variance Reduction

Mehta, Ronak; Roulet, Vincent; Pillutla, Krishna; Harchaoui, Zaid (May 2024, OpenReview)
OpenReview (Ed.)
We consider the distributionally robust optimization (DRO) problem with spectral risk-based uncertainty set and f-divergence penalty. This formulation includes common risk-sensitive learning objectives such as regularized conditional value-at-risk (CVaR) and average top-k loss. We present Prospect, a stochastic gradient-based algorithm that only requires tuning a single learning rate hyperparameter, and prove that it enjoys linear convergence for smooth regularized losses. This contrasts with previous algorithms that either require tuning multiple hyperparameters or potentially fail to converge due to biased gradient estimates or inadequate regularization. Empirically, we show that Prospect can converge 2-3× faster than baselines such as stochastic gradient and stochastic saddle-point methods on distribution shift and fairness benchmarks spanning tabular, vision, and language domains.
Full Text Available
Independence Testing for Temporal Data

Shen, Cencheng; Chung, Jaewon; Mehta, Ronak; Xu, Ting; Vogelstein, Joshua (May 2024, Transactions on Machine Learning Research)

Full Text Available
Simple Lifelong Learning Machines

https://doi.org/10.1109/TPAMI.2025.3595364

Vogelstein, Joshua T; Dey, Jayanta; Helm, Hayden S; LeVine, Will; Mehta, Ronak D; Tomita, Tyler M; Xu, Haoyin; Geisa, Ali; Wang, Qingyang; van de Ven, Gido M; et al (November 2025, IEEE Transactions on Pattern Analysis and Machine Intelligence)

Free, publicly-accessible full text available November 1, 2026
Stochastic Optimization for Spectral Risk Measures

Mehta, Ronak; Roulet, Vincent; Pillutla, Krishna; Liu, Lang; Harchaoui, Zaid (April 2023, Proceedings of The 26th International Conference on Artificial Intelligence and Statistics)
Ruiz, Francisco; Dy, Jennifer; van de Meent, Jan-Willem (Ed.)
Spectral risk objectives – also called L-risks – allow for learning systems to interpolate between optimizing average-case performance (as in empirical risk minimization) and worst-case performance on a task. We develop LSVRG, a stochastic algorithm to optimize these quantities by characterizing their subdifferential and addressing challenges such as biasedness of subgradient estimates and non-smoothness of the objective. We show theoretically and experimentally that out-of-the-box approaches such as stochastic subgradient and dual averaging can be hindered by bias, whereas our approach exhibits linear convergence.
Full Text Available
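
The interpolation between average-case and worst-case performance described in the abstract above can be illustrated with a minimal sketch. It assumes the standard definition of a spectral risk as a weighted sum of the sorted losses, with nonnegative, nondecreasing weights that sum to one. The function names `spectral_risk` and `top_k_weights` are hypothetical, and the code only evaluates the objective; it is not the LSVRG algorithm from the paper.

```python
from typing import List, Sequence


def spectral_risk(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Return sum_i weights[i] * (i-th smallest loss).

    Assumes the usual spectral-risk conditions: weights are nonnegative,
    nondecreasing, and sum to one, so larger losses never get less weight.
    """
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to one")
    if any(a > b + 1e-12 for a, b in zip(weights, weights[1:])):
        raise ValueError("weights must be nondecreasing")
    # Pair the smallest weight with the smallest loss, and so on upward.
    return sum(w * l for w, l in zip(weights, sorted(losses)))


def top_k_weights(n: int, k: int) -> List[float]:
    """Weights for the average of the k largest of n losses (hypothetical helper)."""
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    return [0.0] * (n - k) + [1.0 / k] * k


example_losses = [0.5, 2.0, 1.0, 4.0]

# k = n puts uniform weight on every loss: the empirical mean.
average_case = spectral_risk(example_losses, top_k_weights(4, 4))  # 1.875
# k = 2 averages the two largest losses.
top_two = spectral_risk(example_losses, top_k_weights(4, 2))  # 3.0
# k = 1 puts all weight on the largest loss: the worst case.
worst_case = spectral_risk(example_losses, top_k_weights(4, 1))  # 4.0
```

Moving `k` from `n` down to 1 shifts the objective from the mean toward the maximum. Because the weights depend on the sort order of the losses, the objective is non-smooth wherever two losses tie.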
Efficient Discrete Multi-Marginal Optimal Transport Regularization

Mehta, Ronak; Kline, Jeffery; Lokhande, Vishnu Suresh; Fung, Glenn; Singh, Vikas (February 2023, OpenReview.net)

Optimal transport has emerged as a powerful tool for a variety of problems in machine learning, and it is frequently used to enforce distributional constraints. In this context, existing methods often use either a Wasserstein metric, or else they apply concurrent barycenter approaches when more than two distributions are considered. In this paper, we leverage multi-marginal optimal transport (MMOT), where we take advantage of a procedure that computes a generalized earth mover’s distance as a sub-routine. We show that not only is our algorithm computationally more efficient compared to other barycentric-based distance methods, but it has the additional advantage that gradients used for backpropagation can be efficiently computed during the forward pass computation itself, which leads to substantially faster model training. We provide technical details about this new regularization term and its properties, and we present experimental demonstrations of faster runtimes when compared to standard Wasserstein-style methods. Finally, on a range of experiments designed to assess effectiveness at enforcing fairness, we demonstrate our method compares well with alternatives.
Full Text Available
Deep Unlearning via Randomized Conditionally Independent Hessians

Mehta, Ronak R.; Pal, Sourav; Singh, Vikas; Ravi, Sathya N. (January 2022, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition)

Full Text Available